Sentiment analysis is a system for recognizing and extracting opinions in documents. It has two
weaknesses. First, preprocessing in sentiment analysis cannot recognize slang words, so important words that
should be recognized are missed. Second, the feature extraction process recognizes words only by their form
and cannot recognize similar words. In this paper, we propose a spelling checker and wordnet to fix these
weaknesses. We also use k-nearest neighbor (KNN), Naive Bayes, and decision tree as methods to classify
the text. The purpose of this research is, first, to determine the effects of using wordnet and a spelling
checker in sentiment analysis and, second, to improve the data processing in existing sentiment analysis.
The dataset used in the research is a list of tweets in Bahasa. The results show that wordnet and the
spelling checker decrease the values of false positives, false negatives, and true negatives in the
confusion matrix. They increase the accuracy of K-NN from 43% to 72%, Naive Bayes from 41% to 71% and
decision tree from 47% to 75%.

Keywords: Bahasa, sentiment analysis, slang words, wordnet, spelling checker

1. INTRODUCTION

Sentiment analysis labels a text or a set of texts as expressing either a positive or a negative opinion
and can be considered the challenge of building a classifier from text [1]. Because sentiment analysis
draws on many opinion datasets, its data are often non-standard or unstructured, which makes text
classification on those datasets inaccurate. There are two weaknesses in sentiment analysis. The first is
that slang words can prevent sentiment analysis from processing raw data. Slang is a type of language made
up of non-standard words and phrases [2]. Slang words can interfere with the algorithms that process social
media texts in sentiment analysis [3]. As an illustrative example, consider the tweet “at de moment he cnt
just put me in da better zone thoughhhh happy bday mic, ur a legend”. The tweet contains several slang words
and terms that do not belong to the standard vocabulary, so it is not suitable as input for text
classification until each word has been converted to its standard form. The second weakness is that not
every word in the dataset is easy to classify by counting words against a dictionary. For example, consider
the two words “beautiful” and “pretty” in the sentence “She's beautiful, but her sister's not pretty”. A
human reader recognizes that the two words carry different meanings in this sentence, because “pretty” is
preceded by “not”. An algorithm in preprocessing that counts words out of context would be fooled, because
it recognizes words only against the existing dictionary and not the meaning of the whole sentence, so it
assumes that the two words have the same meaning [4].

This research provides an innovation to correct the shortcomings of a sentiment analysis process that
cannot process slang words or recognize words, their synonyms and their opposites. The solutions are to use
a spelling checker to recognize slang words, which addresses the non-word error problem [5], and wordnet to
recognize words, synonyms and their opposites [6]. Wordnet is a dictionary that provides a lexical database
of English for sentiment analysis; it was built manually by a group of lexicographers at Princeton
University [7]. Wordnet can calculate and expand the meaning of words through their synonyms and antonyms
[8]. A spelling checker is a process for detecting spelling errors or slang words in written text using an
existing language dictionary [9]. The spelling checker is very helpful in this research because it helps
recognize spelling errors or slang words, that is, words that are not in the language dictionary [10]. We
use the k-nearest neighbor (KNN), Naive Bayes, and decision tree classification methods for evaluation. The
purpose of this research is to evaluate the effect of using wordnet and a spelling checker on sentiment
analysis with a dataset in Bahasa. This research contributes to correcting the weakness of current
sentiment analysis, which cannot process documents containing slang words and similar words, by using
wordnet and a spelling checker. This is important because slang words can invalidate the classification
calculations in sentiment analysis, so that the results of the sentiment analysis process are incorrect.
Our research differs from previous research because we developed sentiment analysis with a spelling checker
and wordnet. The effect of using the spelling checker and wordnet is a reduction in the values of false
negatives, false positives and true negatives, which increases the accuracy of sentiment analysis.


2. RELATED WORK 

In this paper, we draw on several journals and articles as references for our research. A spelling
checker is very helpful for eliminating slang words from the dataset in sentiment analysis [5], [9].
Ababneh [11] explains that Naive Bayes is better than decision tree and k-NN. Soleh and Purwarianti [12]
explain that slang words can make the calculation of the confusion matrix in sentiment analysis inaccurate.
Dey et al. [13] explain that Naive Bayes is better than other classification methods for hotel review case
studies, and Suhariyanto et al. [14] explain that support-vector machine (SVM) is better than other
classification methods for fake movie detection [15]. Other studies report that decision tree is better
than Naive Bayes and k-NN [16], that k-NN is better than Naive Bayes [17], and that the k-nearest neighbor
classifier is transparent, consistent, straightforward, simple to understand, has a high tendency to possess
desirable qualities and is easier to implement than most other machine learning techniques [18]. Extracting
datasets from social media is better suited to unsupervised machine learning than to supervised learning
[19], SVM has lower accuracy than k-NN and Naive Bayes [20], and sentiment analysis is widely applied in
fields related to opinions describing satisfaction or dissatisfaction, such as case studies of tourist
reviews, film reviews and the like [21], [22]. These case studies reach many different conclusions because
each uses different datasets, both training data and testing data. What is interesting in the related work
is that some studies report that slang or non-standard words can interfere with and distort natural
language processing tasks carried out on social media texts, and that sentiment analysis algorithms are
currently unable to detect similar words in the training and testing data, so the calculation process in
sentiment analysis becomes less valid. Several studies mention that spelling checkers can convert slang
words into standard words [23], [24]. This is valuable because converting slang words in the training and
testing data makes the calculation process in sentiment analysis better and more valid [25]. Wordnet is a
language dictionary for finding the similarities, synonyms and antonyms of a word [26]. Wordnet can be
applied in natural language processing (NLP) to search for synonyms so that sentiment analysis can identify
synonyms and opposites in the dataset. Our research is similar to previous research in that we study
sentiment analysis, and we do so because we believe that the accuracy of sentiment analysis methods can be
improved further. The difference from previous studies is that we apply a spelling checker and wordnet
before the preprocessing step in sentiment analysis because, based on our research, they reduce the false
negative, false positive and true negative values in the confusion matrix. This increases the accuracy,
precision, recall and F1 score compared with sentiment analysis that does not use a spelling checker and
wordnet.
3. METHOD

The research methodology starts by building a framework for solving the problem. The framework of this
research consists of two processes. The first is the spelling checker and wordnet process. The second is
text classification.


3.1. Slang word 

Slang is non-standard language used for easier and faster communication within a social group. When a
researcher conducts sentiment analysis, a dataset that contains slang words interferes with the text
classification process [27], especially with the confusion matrix calculation, because the formulas for
precision, recall, F1 score and accuracy depend on the true positive (TP), true negative (TN), false
positive (FP) and false negative (FN) values. If slang words remain in the dataset, the accuracy, precision,
recall and F1 score values can become invalid, because the TP, TN, FN and FP values do not match the data in
the dataset [28].


3.2. Wordnet process 

Measuring semantic similarity is a problem that often occurs in artificial intelligence. It has been
widely used in NLP, information retrieval, word sense disambiguation, recommender systems, and
information extraction. Wordnet, a lexical database of English, can address this problem. In wordnet, nouns,
verbs, adverbs and adjectives are organized by a variety of semantic relations into synonym sets, each of
which represents one concept. The semantic relations used by wordnet include antonyms, synonyms and
hyponyms [29]. The knowledge base derived from wordnet makes semantic knowledge available that can be used
to overcome many problems associated with the richness of natural language. A measure of semantic
similarity is well suited as an alternative to pattern matching in the comparison process in sentiment
analysis [30].


3.3. Spelling checker and wordnet in preprocessing 

In this paper, the spelling checker and word expansion using wordnet are applied before the
preprocessing and confusion matrix processes. The result is a dataset that contains no slang words and in
which every word has its synonyms and antonyms listed, so that the words from the dataset and the training
data can be identified in the confusion matrix. Details of the process can be seen in Figure 1.
Figure 1. Flow chart of the spelling checker and wordnet process in preprocessing

Figure 1 presents the flow of the spelling checker and wordnet process. The following steps are taken:
i) the first step is to input the list of slang words and the list of tweets. For example, we have the tweet
in Bahasa “makanan ini endess sekali”, which contains one slang word, “endess”; ii) in the second step, the
algorithm searches for the slang word “endess” in the list of slang words and replaces it with “enak”. The
result of the spelling checker process is a tweet without slang words, “makanan ini enak sekali”; iii) the
third step expands each word using wordnet to find every synonym and antonym of each word in the tweet; and
iv) after the expansion with wordnet, the process yields two datasets. The first is a list of synonyms and
antonyms of each word in each tweet, grouped into positive and negative words. The second is a list of
tweets that contain no slang words, which is used to test the text classification process in sentiment
analysis.
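
The four steps can be illustrated with a minimal Python sketch. It is not the implementation used in this research: the names `SLANG`, `LEXICON`, `correct_slang` and `expand` are hypothetical, the only slang pair taken from the text is “endess” to “enak”, and the synonym and antonym entries are illustrative assumptions that stand in for a real wordnet lookup.

```python
# Hypothetical slang list; only the pair "endess" -> "enak" comes from the text.
SLANG = {"endess": "enak"}

# Hypothetical stand-in for a wordnet lookup: word -> (synonyms, antonyms).
# The entries are illustrative assumptions, not real wordnet output.
LEXICON = {"enak": ({"lezat", "sedap"}, {"hambar"})}


def correct_slang(tweet, slang=SLANG):
    """Steps i-ii: replace every token found in the slang list."""
    tokens = tweet.lower().split()
    return " ".join(slang.get(token, token) for token in tokens)


def expand(tweet, lexicon=LEXICON):
    """Step iii: collect synonyms and antonyms of every token in the tweet."""
    synonyms, antonyms = set(), set()
    for token in tweet.split():
        token_synonyms, token_antonyms = lexicon.get(token, (set(), set()))
        synonyms |= token_synonyms
        antonyms |= token_antonyms
    return synonyms, antonyms


# Step iv: the cleaned tweet and its expansion are the two outputs.
clean_tweet = correct_slang("makanan ini endess sekali")
expansion = expand(clean_tweet)
print(clean_tweet)
print(expansion)
```

A dictionary lookup keeps the correction step deterministic, which matches the list-based replacement described above; a real system would load the slang list and the lexical database from external resources.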


3.4. Text classification 

Text classification is the process of classifying a large text document to obtain the important
information in it [31]. It can help to evaluate text-related case studies such as the classification of
film reviews, hotel reviews and news reviews. Text classification consists of several processes such as
preprocessing, feature extraction and classification [32]. Details of the text classification process can
be seen in Figure 2.
Figure 2. Text classification process phases


Figure 2 presents the flow of the text classification process, which consists of preprocessing, feature
selection, indexing and classification. Preprocessing contains several steps such as case folding,
tokenizing, filtering and stemming [33]. Feature selection and feature extraction in this research use the
term frequency-inverse document frequency (TF-IDF) algorithm [34].

The result of feature extraction in this research is a TF-IDF weight for each tweet along with its
positive or negative label. Classification in this research determines the confusion matrix to find the true
positive, true negative, false positive and false negative values that are used to calculate recall,
precision, accuracy and F1 score [35]. Several text classification algorithms from the machine learning and
data mining communities exist, such as Naive Bayes, decision trees (DTs), and k-nearest neighbors (KNN).


3.5. Data preprocessing 

Each dataset, both training data and testing data, goes through preprocessing. Data preprocessing
prepares the collected dataset by eliminating the unused parts so that only the important data remain, which
are later used as features in building the sentiment analysis model [36]. Data preprocessing consists of
several steps:

a) Case folding is the process of changing all the letters in a text to lowercase.

b) Tokenizing/parsing is the stage of cutting the input string into the words that make it up, using
spaces to separate the words.

c) Filtering is the stage of taking the words that are considered important from the results of
tokenizing. Filtering can use a stoplist (discarding less important words) or a wordlist (keeping
important words).

d) Stemming is the process of changing a word into its base form.
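
The four preprocessing steps can be chained as in the following Python sketch. It is only an illustration under stated assumptions: the stoplist `STOPLIST` and the suffix list `SUFFIXES` are hypothetical, and the naive suffix stripping stands in for a proper Bahasa stemmer, which this sketch does not implement.

```python
# Hypothetical stoplist and suffix list, chosen only for this illustration.
STOPLIST = {"ini", "sekali"}
SUFFIXES = ("nya", "kan", "an")


def case_fold(text):
    """a) Case folding: convert every letter to lowercase."""
    return text.lower()


def tokenize(text):
    """b) Tokenizing: split the string on whitespace."""
    return text.split()


def filter_tokens(tokens, stoplist=STOPLIST):
    """c) Filtering: discard the less important words in the stoplist."""
    return [token for token in tokens if token not in stoplist]


def stem(token):
    """d) Stemming: strip one known suffix (a naive assumption, not a full stemmer)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run the four steps in order and return the remaining base forms."""
    return [stem(token) for token in filter_tokens(tokenize(case_fold(text)))]


print(preprocess("Makanan ini ENAK sekali"))
```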


3.6. Feature extraction 

The TF-IDF algorithm is a common method for feature extraction in text classification, and it is a
simple and highly efficient algorithm for text representation [37]. TF-IDF is a statistical method for
assessing the importance of a word to one document in a corpus. It calculates the weight of a text based on
the term frequency and the inverse document frequency of a word or phrase, which is a feature item. The
weighting formulas are given in (1)-(3).


$$\text{TF-IDF} = \text{TF} \times \log(\text{IDF}) \qquad (1)$$

$$\text{IDF} = \frac{N}{\text{DF}} \qquad (2)$$

$$W = \text{TF} \times \log(\text{IDF}) \qquad (3)$$


Here term frequency (TF) represents the frequency of a feature item, inverse document frequency (IDF)
represents the inverse of the document frequency of the feature item, N is the total number of documents
and DF is the number of documents that contain the feature item. When a word or phrase appears in all
documents, the IDF value is 1, and taking the logarithm gives 0, which makes the entire result zero. The
TF-IDF term weighting combines the standard TF formula with the IDF formula by multiplying the TF value by
the logarithm of the IDF value. The weight (W) is the term weight of the feature item in the document.
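
As a worked illustration of the weighting, the following Python sketch computes a weight for every term of a small tokenized corpus. It assumes the common definition in which the inverse document frequency is the ratio of the number of documents to the number of documents containing the term, and it assumes a base-10 logarithm; the function name `tf_idf` and the two example documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumes weight = TF * log10(N / DF), where N is the number of documents
    and DF is the number of documents that contain the term.
    """
    n_docs = len(documents)
    document_frequency = Counter()
    for document in documents:
        document_frequency.update(set(document))  # count each term once per document

    weights = []
    for document in documents:
        term_frequency = Counter(document)
        weights.append({
            term: count * math.log10(n_docs / document_frequency[term])
            for term, count in term_frequency.items()
        })
    return weights


# Hypothetical two-document corpus, already preprocessed.
corpus = [["makanan", "enak"], ["makanan", "hambar"]]
weights = tf_idf(corpus)
print(weights[0])
```

The term “makanan” appears in both documents, so its ratio is 1 and its weight is zero, which mirrors the remark above about words that appear in all documents.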


3.7. Classification 

Based on the related work that we have studied, in this paper we use k-NN, Naive Bayes and decision
tree to classify the results of our research. K-NN is a classifier in sentiment analysis that uses
distance-based measures. The main idea is that documents that fall into the same class are more likely to
be "similar" or close to each other according to a similarity measure such as the cosine measure defined
over the dictionary [38]. DTs use a hierarchical structure to carry out the classification process. The
decision tree algorithm proceeds recursively from the root node to the leaf nodes. It transforms the data
into a tree model and then transforms the tree model into rules [39]. The algorithm used to build the
decision tree is ID3, which is based on calculating the entropy value. Entropy measures "how informative" a
node is. The formulas for entropy and information gain are shown in (4) and (5), where p_i is the proportion
of samples in S that belong to class i and S_i is the subset of S for which attribute A takes its i-th
value.


$$\text{Entropy}(S) = \sum_{i=1}^{n} -p_i \times \log_2 p_i \qquad (4)$$

$$\text{Gain}(S, A) = \text{Entropy}(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|} \times \text{Entropy}(S_i) \qquad (5)$$


The Naive Bayes classifier is a classification method based on Bayes' theorem. Naive Bayes is a machine
learning method that uses probability values and statistical calculations [40]. Its basic concept is to
predict future probabilities based on values observed in the past, which is known as Bayes' theorem [41].
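
The entropy and information gain used by ID3 can be illustrated with a short Python sketch. It is a minimal example of the standard definitions rather than the code used in this research; the function names and the four-sample toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels, using a base-2 logarithm."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(labels, attribute_values):
    """Entropy of the whole set minus the weighted entropy of each subset."""
    total = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(
        len(subset) / total * entropy(subset) for subset in subsets.values()
    )
    return entropy(labels) - remainder


# Hypothetical toy data: four tweets with a binary attribute.
labels = ["pos", "pos", "neg", "neg"]
informative = [True, True, False, False]    # separates the classes perfectly
uninformative = [True, False, True, False]  # carries no class information

print(entropy(labels))
print(information_gain(labels, informative))
print(information_gain(labels, uninformative))
```

ID3 would choose the attribute with the larger gain as the split at the current node, which in this toy case is the first attribute.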


3.8. Evaluation 

In evaluating the performance of the algorithms in our study, we use the confusion matrix as a
reference. The confusion matrix represents the predictions and the actual conditions of the data generated
by the classification methods in our study [42]. Based on the values in the confusion matrix, we can
determine the accuracy, precision, recall, and F1 score of each sentiment analysis classification method.


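$$\text{Precision} = \frac{TP}{TP + FP} \qquad (6)$$

$$\text{Recall} = \frac{TP}{TP + FN} \qquad (7)$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (8)$$

$$\text{F1 Score} = \frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}} \qquad (9)$$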


Recall and precision are often used to evaluate the effectiveness of machine learning algorithms.
Accuracy is the ratio of correct predictions (positive and negative) to the whole data. Precision is the
ratio of true positive predictions to all positive predictions. Recall is the ratio of true positive
predictions to all actual positive data. The F1 score is the harmonic mean of precision and recall.
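
The four measures in (6)-(9) can be computed directly from the confusion matrix counts, as in the following Python sketch. The function name `evaluate` is hypothetical; the counts in the usage example are those reported for plain K-NN (K=3) in Table 4.

```python
def evaluate(tp, tn, fp, fn):
    """Compute precision, recall, accuracy and F1 score from confusion matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1_score = 2 * recall * precision / (recall + precision)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1_score": f1_score,
    }


# Counts for plain K-NN (K=3) as listed in Table 4.
scores = evaluate(tp=149, tn=109, fp=235, fn=107)
for name, value in scores.items():
    print(f"{name}: {value:.0%}")
```

Rounded to whole percentages, these counts give 39% precision, 58% recall, 43% accuracy and a 47% F1 score.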
4. RESULTS AND DISCUSSION
4.1. Spelling checker

We collected 1026 tweets [43] from Twitter using the Twitter API. The tweets contain slang words, so
we needed to correct them with the spelling checker. We found 113 slang words in the training data and 139
slang words in the testing data. The slang word counts before and after cleaning can be seen in Table 1.


Table 1. Slang words correction by spelling checker

Dataset          Slang words without spelling checker    Slang words with spelling checker
                 Before        After                     Before        After
Training data    113           113                       113           -
Testing data     139           139                       139           -
Total            242           242                       242           0


Table 1 shows that before the spelling checker process, the training data contained 113 slang words
and the testing data contained 139 slang words. After correction using the list of slang words, no slang
words remained in the training data or the testing data. The spelling checker checks the slang words in
each dataset, both training data and testing data. Each slang word is matched against the slang word
dictionary and converted into a standard word, so the number of slang words drops to 0.


4.2. Expansion words with wordnet 

From the 1026 tweets, we took a sample of positive and negative words: 75 positive words and 75
negative words. We limited the sample to 75 words of each polarity because with a larger sample wordnet
produced few additional synonyms and antonyms, since wordnet sometimes returns positive and negative words
that duplicate words already in the sample. Details of the word expansion with wordnet in the training data
and testing data can be seen in Table 2.


Table 2. Expansion word by wordnet

                  Corpus list of positive     Expansion word by wordnet
                  and negative words          Synonyms      Antonyms
Positive words    75                          117           133
Negative words    75                          134           115
Total             150                         251           248


Table 2 shows the results of expanding the corpus list of positive and negative words with synonyms
and antonyms from wordnet. The list of positive words, which previously contained 75 words, produced 117
synonyms and 133 antonyms (negative words) after expansion with wordnet. The list of negative words, which
previously contained 75 words, produced 134 synonyms and 115 antonyms. The new list of positive and negative
words resulting from the wordnet expansion therefore grew from 150 words to 499 words.


4.3. Feature extraction 

The values in Table 3 were obtained by calculating the TF and IDF weights between the list of positive
and negative words from Table 2 and the list of tweets, so that each tweet could be given a positive or
negative label using the TF-IDF algorithm. This labeling was evaluated using the sentiment analysis
classification methods. The number of positive labels in the training data was 233 tweets before using the
spelling checker and 188 tweets after; in the testing data it changed from 231 tweets to 177 tweets. The
number of negative labels in the training data was 220 tweets before and 262 tweets after; in the testing
data it changed from 110 tweets to 132 tweets. The number of neutral labels in the training data was 147
tweets before and 150 tweets after; in the testing data it changed from 85 tweets to 117 tweets. The
labeling results for the testing data in Table 3 were used as a comparison in determining the accuracy of
each classifier used in this study.
Table 3. Dataset feature extraction with TF-IDF algorithm

                  Without spelling checker and wordnet    With spelling checker and wordnet
                  Training data     Testing data          Training data     Testing data
Positive label    233               231                   188               177
Negative label    220               110                   262               132
Neutral label     147               85                    150               117
Total             600               426                   600               426


4.4. Classification results

The total values of true positive, true negative, false positive, and false negative for each
classification method in this research can be seen in Table 4. Table 4 also shows lower false negative,
false positive and true negative values for the methods that use the spelling checker and wordnet, because
the resulting dataset contains standard words after the spelling checker process before preprocessing, and
the feature extraction process is combined with wordnet. After obtaining the true positive, true negative,
false positive and false negative values, we can calculate the accuracy, precision, recall, and F1 score.
Calculation details can be seen in Table 5.


Table 4. Confusion matrix model

Variable                                           True Positive   True Negative   False Positive   False Negative
K-NN, K=3                                          149             109             235              107
Naive Bayes                                        132             114             233              121
Decision tree                                      174             108             258              60
K-NN with wordnet, K=3                             223             101             129              147
Naive Bayes with spelling checker                  201             105             122              172
Decision tree with spelling checker                241             107             117              135
K-NN with spelling checker, K=3                    212             130             181              77
Naive Bayes with wordnet                           191             109             177              123
Decision tree with wordnet                         233             109             151              107
K-NN with spelling checker and wordnet, K=3        322             110             98               70
Naive Bayes with spelling checker and wordnet      319             105             105              71
Decision tree with spelling checker and wordnet    269             181             71               73


Table 5. Comparison of results using cross validation

Method                                             Precision   Recall   Accuracy   F1 Score
K-NN, K=3                                          39%         58%      43%        47%
Naive Bayes                                        36%         52%      41%        43%
Decision tree                                      40%         74%      47%        52%
K-NN with wordnet, K=3                             63%         60%      54%        62%
Naive Bayes with spelling checker                  62%         54%      51%        58%
Decision tree with spelling checker                67%         64%      58%        66%
K-NN with spelling checker, K=3                    54%         73%      57%        62%
Naive Bayes with wordnet                           52%         61%      50%        56%
Decision tree with wordnet                         61%         69%      57%        64%
K-NN with spelling checker and wordnet, K=3        71%         82%      72%        79%
Naive Bayes with spelling checker and wordnet      75%         82%      71%        78%
Decision tree with spelling checker and wordnet    78%         79%      75%        78%


Table 5 compares the precision, recall, accuracy and F1 score of the twelve classification methods.
These values are obtained from the confusion matrix values in Table 4. In outline, the table shows that
precision, recall, accuracy and F1 score increase after using the spelling checker and wordnet. This
happens because the confusion matrix in Table 4 has a larger true positive value and lower false positive,
false negative and true negative values.

Figure 3 depicts a diagram of the calculation results from the confusion matrix. It shows an increase
in accuracy for the sentiment analysis methods whose training data and testing data have gone through word
expansion using wordnet and have been cleaned of slang words using the spelling checker. The accuracy
values are taken from the calculation in Table 5 and are represented as a diagram in Figure 3.
Figure 3. Confusion matrix of all classification methods


5. CONCLUSION 

The conclusion of this paper is that using a spelling checker to convert slang words into standard
words and using wordnet as an NLP resource to find similar words can improve the accuracy of sentiment
analysis. This happens because the false negative, false positive and true negative values in the confusion
matrix decrease after using the spelling checker, and the true positive value increases after using
wordnet, because wordnet allows the feature extraction process to recognize words and their similarities.
According to (6)-(9), when the false positive, false negative and true negative values decrease, the
accuracy value increases. The calculation of the accuracy value therefore depends strongly on the false
positive, false negative, and true negative values.